An Overview of the Penman Text Generation System

Author

  • William C. Mann
Abstract

The problem of programming computers to produce natural language explanations and other texts on demand is an active research area in artificial intelligence. In the past, research systems designed for this purpose have been limited by the weakness of their linguistic bases, especially their grammars, and their techniques often cannot be transferred to new knowledge domains. A new text generation system, Penman, is designed to overcome these problems and produce fluent multiparagraph text in English in response to a goal presented to the system. Penman consists of four major modules: a knowledge acquisition module which can perform domain-specific searches for knowledge relevant to a given communication goal; a text planning module which can organize the relevant information, decide what portion to present, and decide how to lead the reader’s attention and knowledge through the content; a sentence generation module based on a large systemic grammar of English; and an evaluation and plan-perturbation module which revises text plans based on evaluation of text produced. Development of Penman has included implementation of the largest systemic grammar of English in a single notation. A new semantic notation has been added to the systemic framework, and the semantics of nearly the entire grammar has been defined. The semantics is designed to be independent of the system’s knowledge notation, so that it is usable with widely differing knowledge representations, including both frame-based and predicate-calculus-based approaches.

1. TEXT GENERATION AS A PROBLEM

AI research in text generation has a long history, but it has had a much lower level of activity than research in language comprehension. It has recently become clear that text generation capabilities (far beyond what can be done with canned text) will be needed, because AI systems of the future will have to justify their actions to users.
Text generation capabilities are also being developed as parts of instruction systems [Swartout 83a], data base systems [McKeown 82], program specification systems [Swartout 82, Swartout 83b], expert consulting systems [Swartout 81] and others. A group who recently assessed the state of the art in text generation ([Mann 81a]) concluded that there are four critical technologies which will largely determine the pace of text generation progress in this decade:

1. Knowledge Representation
2. Linguistically Justified Grammars
3. Models of Text Readers
4. Models of Structures and Functions in Discourse

The Penman system has distinct roles for each of these, contributing particularly to 2 and 4.

Penman is intended as a portable, reusable text generation facility which can be embedded in many kinds of systems. By design, it is not tied to a single knowledge domain, to avoid the potential waste of effort inherent in developing single-domain systems whose domain-independent, reusable specific knowledge is not retained. Penman’s techniques are adequate to cover the data base domain of McKeown’s text generator [McKeown 82], Davey’s game transcripts domain [Davey 79], the crisis instructional domain of Mann and Moore [Moore & Mann 79] and others.

This research was supported by the Air Force Office of Scientific Research contract No. F49620-79-C-0181. The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research of the U.S. Government.

2. SYSTEM OVERVIEW

Figure 2-1 shows the principal data flows in Penman. The given goal controls both the search for relevant information (Acquisition) and the organization of that information (Text Planning). Plans are hierarchic, with plans for clauses or sentences at the finest level of detail.
These plans include both the logical content to be expressed and how each unit leads the reader’s attention through the material. The sentence generator module (Sentence Generation) executes the most detailed level of the plan, thus producing a draft text. The evaluation and revision module (Improvement) evaluates the text, applying measures of quality and comparing the text with the plan for producing it. The module then produces perturbations in the plan to attempt to improve the text. A text is complete when the perturbations suggested by Improvement do not improve the value level identified by the Improvement module.

The major knowledge resources of these modules are also indicated in the figure. In addition to the knowledge notation itself, there is a knowledge base for generic and concrete knowledge of the subject matter and its relation to the world in general, a model of discourse, represented as a collection of patterns and rules which guide text planning, and a model of the reader. We describe the modules (in order of degree of development rather than in the data flow order described above) in the topical sections below.

From: AAAI-83 Proceedings. Copyright ©1983, AAAI (www.aaai.org). All rights reserved.
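The data flow of Section 2 (Acquisition, Text Planning, Sentence Generation, Improvement) can be sketched as a small hill-climbing loop. This is only an illustrative toy under stated assumptions, not Penman's implementation: the function names, the list-of-strings knowledge base, the "drop one fact" perturbation, and the word-budget quality measure are all hypothetical stand-ins introduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Plan:
    """Toy text plan: one clause-level entry per fact, in presentation order."""
    facts: Tuple[str, ...]


def acquire(goal: str, knowledge_base: List[str]) -> List[str]:
    """Acquisition: keep facts that mention the goal topic (toy relevance test)."""
    return [fact for fact in knowledge_base if goal in fact]


def plan_text(facts: List[str]) -> Plan:
    """Text Planning: here, simply present the facts in the order found."""
    return Plan(tuple(facts))


def generate(plan: Plan) -> str:
    """Sentence Generation: realize each clause-level entry as one sentence."""
    return " ".join(f[0].upper() + f[1:] + "." for f in plan.facts)


def evaluate(text: str, plan: Plan, max_words: int = 12) -> int:
    """Improvement (measure): facts expressed minus words over a length budget.

    The budget is a hypothetical quality measure chosen for this sketch.
    """
    return len(plan.facts) - max(0, len(text.split()) - max_words)


def perturb(plan: Plan) -> List[Plan]:
    """Improvement (perturbation): candidate plans that each drop one fact."""
    if len(plan.facts) <= 1:
        return []
    return [Plan(plan.facts[:i] + plan.facts[i + 1:])
            for i in range(len(plan.facts))]


def generate_text(goal: str, knowledge_base: List[str]) -> Tuple[str, int]:
    """Run the full loop and return the final text with its score."""
    plan = plan_text(acquire(goal, knowledge_base))
    text = generate(plan)
    score = evaluate(text, plan)
    while True:
        candidates = [(evaluate(generate(p), p), p) for p in perturb(plan)]
        if not candidates:
            break
        best_score, best_plan = max(candidates, key=lambda pair: pair[0])
        if best_score <= score:
            break  # no perturbation improves the value, so the text is complete
        plan, text, score = best_plan, generate(best_plan), best_score
    return text, score
```

The stopping test mirrors the completion criterion in the text: the loop ends as soon as no suggested perturbation raises the evaluated value of the draft.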

Similar resources

The Penman Language Generation Project

The natural language sentence generation program Penman provides computational technology for generating English sentences and paragraphs, starting with input specifications of a non-linguistic kind. The research goals underlying Penman are threefold: to provide a useful and theoretically motivated resource for other research and development groups and the computational community at large, to p...

Imagene Penman

This study employs a knowledge intensive corpus analysis to identify the elements of the communicative context which can be used to determine the appropriate lexical and grammatical form of instructional texts. imagene, an instructional text generation system based on this analysis, is presented, particularly with reference to its expression of precondition relations.

An Overview of Hydroelectric Power Plant: Operation, Modeling, and Control

Renewable energy provides twenty percent of electricity generation worldwide. Hydroelectric power is the cheapest way to generate electricity today. It is a renewable source of energy and provides almost one-fifth of electricity in the world. Also, it generates electricity using a renewable natural resource and accounting for six percent of worldwide energy supply or about fifteen percent of th...

An Overview on Microgrid Concept with Special Focus on Islanding Protection Issues

Subscriber service is not feasible in the construction of large-scale traditional networks with the aim of providing more services. The high distance between production and consumption requires the definition of a transmission network as a challenging intermediary. The cost of transmission network and the risk associated with it cannot be ignored at all. The idea of a microgrid, which began wit...

A Flexible Interface for Linking Applications to Penman's Sentence Generator

The Penman text generation system has been used within several different experimental application domains, demonstrating that it provides the basis for an adaptable general purpose text generation capability. Linking with these applications also indicated several ways that Penman's interface with applications could be improved. Penman's interface with applications is described, focusing on SPL,...

Publication date: 1983